
The Importance of Sampling in Meta-Reinforcement Learning

Neural Information Processing Systems

We interpret meta-reinforcement learning as the problem of learning how to quickly find a good sampling distribution in a new environment. This interpretation leads to the development of two new meta-reinforcement learning algorithms: E-MAML and E-$\text{RL}^2$. Results are presented on a new environment we call 'Krazy World': a difficult high-dimensional gridworld which is designed to highlight the importance of correctly differentiating through sampling distributions in meta-reinforcement learning. Further results are presented on a set of maze environments. We show E-MAML and E-$\text{RL}^2$ deliver better performance than baseline algorithms on both tasks.


Reviews: The Importance of Sampling in Meta-Reinforcement Learning

Neural Information Processing Systems

The paper shows the importance of the training setup used for MAML and RL$^2$. A setup can include "exploratory episodes" and measure the loss only on the subsequent "reporting" episodes. The paper presents interesting results, and the introduced E-MAML and E-RL$^2$ variants clearly help. The main problem with the paper: the objective is not well defined. I could only deduce from Appendix C that the setup is: after starting in a new environment, do 3 exploratory episodes and report the reward collected on the next 2 episodes.
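The evaluation setup the reviewer deduced from Appendix C can be sketched as a simple loop: run a few exploratory episodes whose reward is discarded, then report the mean reward over the following episodes. This is a minimal illustrative sketch, not the paper's code; `run_episode`, `evaluate`, and the callable-environment interface are hypothetical stand-ins.

```python
def run_episode(agent, env):
    # Hypothetical stand-in: one episode in `env` returns a scalar reward.
    return env(agent)

def evaluate(agent, env, n_explore=3, n_report=2):
    """After n_explore unscored exploratory episodes, return the mean
    reward over the next n_report reporting episodes."""
    for _ in range(n_explore):
        run_episode(agent, env)          # exploration: reward not scored
    total = sum(run_episode(agent, env)  # reporting: reward is scored
                for _ in range(n_report))
    return total / n_report
```

With the defaults (3 exploratory, 2 reporting episodes), only the fourth and fifth episodes contribute to the reported score.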


The Importance of Sampling in Meta-Reinforcement Learning

Bradly Stadie, Ge Yang, Rein Houthooft, Peter Chen, Yan Duan, Yuhuai Wu, Pieter Abbeel, Ilya Sutskever

Neural Information Processing Systems